sentence representation
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Dominican Republic (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (21 more...)
- Government (0.67)
- Law (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- North America > United States > New York (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (2 more...)
- Health & Medicine (0.46)
- Government (0.46)
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > New York (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (2 more...)
- Health & Medicine (0.46)
- Government (0.46)
Patent Representation Learning via Self-supervision
Zuo, You, Gerdes, Kim, de La Clergerie, Eric Villemonte, Sagot, Benoît
This paper presents a simple yet effective contrastive learning framework for learning patent embeddings by leveraging multiple views from within the same document. We first identify a patent-specific failure mode of SimCSE style dropout augmentation: it produces overly uniform embeddings that lose semantic cohesion. To remedy this, we propose section-based augmentation, where different sections of a patent (e.g., abstract, claims, background) serve as complementary views. This design introduces natural semantic and structural diversity, mitigating over-dispersion and yielding embeddings that better preserve both global structure and local continuity. On large-scale benchmarks, our fully self-supervised method matches or surpasses citation-and IPC-supervised baselines in prior-art retrieval and classification, while avoiding reliance on brittle or incomplete annotations. Our analysis further shows that different sections specialize for different tasks-claims and summaries benefit retrieval, while background sections aid classification-highlighting the value of patents' inherent discourse structure for representation learning. These results highlight the value of exploiting intra-document views for scalable and generalizable patent understanding.
- North America > United States (0.46)
- Europe > France > Île-de-France > Paris > Paris (0.04)
Do Prompts Reshape Representations? An Empirical Study of Prompting Effects on Embeddings
Gonzalez-Gutierrez, Cesar, Hovy, Dirk
Prompting is a common approach for leveraging LMs in zero-shot settings. However, the underlying mechanisms that enable LMs to perform diverse tasks without task-specific supervision remain poorly understood. Studying the relationship between prompting and the quality of internal representations can shed light on how pre-trained embeddings may support in-context task solving. In this empirical study, we conduct a series of probing experiments on prompt embeddings, analyzing various combinations of prompt templates for zero-shot classification. Our findings show that while prompting affects the quality of representations, these changes do not consistently correlate with the relevance of the prompts to the target task. This result challenges the assumption that more relevant prompts necessarily lead to better representations. We further analyze potential factors that may contribute to this unexpected behavior.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Dominican Republic (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (18 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.94)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Dominican Republic (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (21 more...)
- Government (0.67)
- Law (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)